Tool(s):
We used the Jigsaw visual analytics system developed here in our group
at Georgia Tech to work on the problem. Jigsaw is an analysis
system that helps people work with document collections. It has
been in development for the last five years. More
about the system can be found at
http://www.gvu.gatech.edu/ii/jigsaw.
Video:
Video
ANSWERS:
------------------------------------------------------------------------
MC 3.1 Potential Threats: Identify any imminent terrorist threats in
the Vastopolis metropolitan area. Provide detailed information on the
threat or threats (e.g. who, what, where, when, and how) so that
officials can conduct counterintelligence activities. Also, provide a
list of the evidential documents supporting your answer.
Process Description
Our analysis process included three key components: data import, data
cleaning and annotation, and the actual analysis. These three phases
were not strictly separated but intertwined throughout the entire
process. For instance, we continued data cleaning while doing
analysis.
Data Import
First we had to import the initial files into Jigsaw. We added some
code to Jigsaw so that the system would parse each document's text and
separate it into sections for the title, date, and article body. This
allowed us to read all the documents into the system. Next, we ran
the entity identification process within Jigsaw to find relevant
people, organizations, locations, etc. within the documents.
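As a rough illustration of the parsing step above, the following Python sketch splits one raw document into title, date, and body. It is not the code we added to Jigsaw (which is written in Java). The file layout is an assumption made for the example: the first non-empty line is taken as the title, the second as the date, and everything after that as the article body. The names Article and parse_article are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    date: str   # kept as raw text; the date format is not assumed here
    body: str


def parse_article(raw: str) -> Article:
    """Split one raw document into title, date, and body.

    ASSUMPTION (for illustration only): the first non-empty line is the
    title, the second non-empty line is the date, and the remaining
    non-empty lines form the article body.
    """
    non_empty = [line.strip() for line in raw.splitlines() if line.strip()]
    if len(non_empty) < 2:
        raise ValueError("expected at least a title line and a date line")
    title, date = non_empty[0], non_empty[1]
    body = " ".join(non_empty[2:])
    return Article(title=title, date=date, body=body)
```

Calling parse_article on the text of each file would yield records that can then be passed to an entity identification step.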
Data Cleaning and Annotation
The entity extraction process produced too many entities (e.g. 21,866
people and 19,184 organizations) to be manageable, including many
false positives. To reduce the number of entities, we added code to
Jigsaw to remove entities occurring in only one or two documents
within the system. In general, we thought that such entities were
likely either errors or not important to a central plot. If we
subsequently found one of these entities to be important, we could add
it back in later. This process decreased the number of person
entities by almost a factor of 10, resulting in a much more manageable
set. We then did further manual cleaning of the data set by removing
or correcting wrongly identified entities and adding entities that
were missed in the identification process. This process took about a
week, working for a few hours per day. Finally, we ran Jigsaw's
computational text analyses of the documents to compute summary
sentences for each document, similarities across documents, and
clusters of related documents.
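The frequency-based pruning described above can be sketched in a few lines of Python. This is an illustration of the idea, not the code we added to Jigsaw. The function name prune_rare_entities, the input structure (a mapping from document ID to the entity names found in that document), and the min_docs parameter are assumptions made for the example. The removed entities are returned as well, so that one can be added back later if it turns out to matter.

```python
from collections import defaultdict


def prune_rare_entities(doc_entities, min_docs=3):
    """Drop entities that occur in fewer than min_docs documents.

    doc_entities: dict mapping a document ID to an iterable of entity names
    (hypothetical input structure for this sketch).
    min_docs=3 removes entities that occur in only one or two documents.

    Returns (kept, removed); each maps entity name -> set of document IDs.
    """
    docs_for = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            docs_for[entity].add(doc_id)

    kept = {e: d for e, d in docs_for.items() if len(d) >= min_docs}
    removed = {e: d for e, d in docs_for.items() if len(d) < min_docs}
    return kept, removed
```

Counting distinct documents rather than raw mentions matches the criterion in the text: an entity mentioned many times in a single document is still treated as rare.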
Investigative Analysis
We then began investigating the documents in more depth. We primarily
used the List View (Figure 1) within Jigsaw to explore the different
entities in the collection. Lists of entities (by type) can be sorted
alphabetically or by the number of documents in which they appear.
Selecting entities in the view highlights connected (related)
entities. At the same time, we explored sets of documents that were
put together into groups within the Cluster View in Jigsaw (Figure 2).
We would identify potentially interesting entities and documents via
these views and would load the relevant documents into Jigsaw's
Document View (Figure 3) for more detailed analysis. This initial
investigation did not turn up many good leads. The most common
entities from the List View did not appear to be involved in any
suspicious activities. Similarly, the clustering did not produce
helpful sets of documents.
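The two List View operations mentioned above, sorting entities and highlighting connected ones, can be approximated as follows. This is a simplified sketch and not Jigsaw's implementation. It assumes that two entities count as connected when they appear together in at least one document, and that docs_for maps each entity name to the set of document IDs containing it. The function names are hypothetical.

```python
def sort_entities(docs_for, by="count"):
    """Return entity names sorted alphabetically or by document count.

    docs_for: dict mapping entity name -> set of document IDs (assumed input).
    by="alpha" sorts by name; any other value sorts by descending document
    count, with ties broken alphabetically.
    """
    if by == "alpha":
        return sorted(docs_for)
    return sorted(docs_for, key=lambda e: (-len(docs_for[e]), e))


def connected_entities(docs_for, selected):
    """Return {entity: number of shared documents} for the selected entity.

    ASSUMPTION: a connection means co-occurrence in at least one document.
    """
    selected_docs = docs_for[selected]
    return {
        entity: len(docs & selected_docs)
        for entity, docs in docs_for.items()
        if entity != selected and docs & selected_docs
    }
```

The shared-document count gives a simple connection strength, which could be used to rank the highlighted entities.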
Figure 1: List View showing connections from Vastopolis to other entities.
Figure 2: Document Cluster View showing sets of related documents.
Figure 3: Document View containing a number of the documents crucial to the plot, with 03212 in focus.
By reading many documents in this way, however, we did begin to notice trends and threads of potential terror/criminal plots. We also searched on relevant terms and examined the sets of resulting documents that Jigsaw loaded for them. This began to give us ideas about what might be going on. It seemed that the majority of documents in the collection were slightly modified news articles from the late 1990s that did not seem related to the plot. A smaller number of typically shorter documents involving recent activities in Vastopolis were suspicious, however.
At this point, we decided to examine all of the documents to find ones fitting this pattern. We used Jigsaw's Document View to load all the documents and do a rapid triage-style pass through them, looking for documents meeting this suspicious profile. The Document View allows a person to do this very quickly, and we were able to go through the entire collection in about 3-4 hours. This process gave us approximately 60 "suspicious" documents to examine in more detail. We loaded all of these documents into Jigsaw's Calendar View (Figure 4) to see their timing, and we read their contents more closely. We used the Tablet window (Figure 5) of Jigsaw to take notes, create timelines, and gather our thoughts about the plot.
Figure 4: Calendar View showing the dates of the small set of suspicious documents.
Figure 5: Tablet window showing our analysis notes including a
timeline and the relevant terrorist groups and individuals.
Solution
It appears that terrorists are planning a bioterrorist attack on
Vastopolis. The two terrorist organizations to watch are the
Paramurderers of Chaos and the Forever Brotherhood of Antarctica. We
believe that the terrorists will likely attack the food and water
supply, either by putting something into the water or by
contaminating food in food processing plants. We believe that the
terrorists will use a biological agent such as a spore. The materials
for doing this may have been stolen from the lab of Prof. Edward
Patino at VAST University. The recent animal deaths both in fields
around Vastopolis and in the river raise suspicions about
contaminants/poisons/diseases being used. The Citizens for Ethical
Treatment of Lab Mice organization has threatened city officials
before, and they have been linked to the Forever Brotherhood.
We started to suspect the Forever Brotherhood of Antarctica and the Paramurderers of Chaos because they are linked to bioterror threats. We think that the deaths of the fish and the other animals are related to actions by the Paramurderers of Chaos because the police confiscated their lab equipment and because suspicious people had been seen in the areas where the animals started dying. Also, the mayor was the victim of a dognapping carried out by the terrorist organization Forever Brotherhood of Antarctica.
Relevant articles: 00008, 00878, 01038, 01482, 01785, 01878, 02385, 03040, 03212, 03237, 03295, 03435, 03662, 03740, 04085, 04314